2025.10.06 | A 15B small model matches DeepSeek-R1; progressive distillation with 128 tokens saves 80% of compute
Description
The 15 papers in this episode are:
[00:28] 🧠 Apriel-1.5-15b-Thinker (a 15B open-source model that punches above its weight to reach frontier-level multimodal reasoning)
[01:04] 🚀 Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
[01:42] 🧩 Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition
[02:19] 🪞 Self-Improvement in Multimodal Large Language Models: A Survey
[02:59] 🧬 Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
[03:38] 📊 CoDA: Agentic Systems for Collaborative Data Visualization
[04:21] 🧐 SurveyBench: How Well Can LLM(-Agents) Write Academic Surveys?
[05:06] 🔧 REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration
[05:53] 🔍 OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features
[06:38] 🔍 FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents (a lightweight retriever that trims long contexts for web agents)
[07:14] 🎯 Improving GUI Grounding with Explicit Position-to-Coordinate Mapping
[08:05] 📏 LSPO: Length-aware Dynamic Sampling for Policy Optimization in LLM Reasoning
[08:45] 🤖 WAInjectBench: Benchmarking Prompt Injection Detections for Web Agents
[09:19] 🍱 Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs
[09:54] 🎯 LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models
<figure>
[Follow Us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递